Abstract

In this paper, we propose a greedy user selection with swap (GUSS) algorithm based on zero-forcing 
beamforming (ZFBF) for the multi-user multiple-input multiple-output (MIMO) downlink channels. 
Since existing user selection algorithms, such as the zero-forcing with selection (ZFS), have 'redundant 
user' and 'local optimum' flaws that compromise the achieved sum rate, GUSS adds 'delete' and 'swap' 
operations to the user selection procedure of ZFS to improve the performance by eliminating 'redundant 
user' and escaping from 'local optimum', respectively. In addition, an effective-channel-gain
updating scheme based on the effective channel vector is presented to reduce the complexity of
GUSS. With the help of this updating scheme, GUSS has the same order of complexity as ZFS, with
only a linear increment. Simulation results indicate that, on average, GUSS achieves 99.3 percent
of the sum rate upper bound achieved by exhaustive search over the range of transmit
signal-to-noise ratios considered, with only three to six times the complexity of ZFS.
I. Introduction

Multi-user multiple-input multiple-output (MIMO) communication, where a multi-antenna
base station (BS) communicates with multiple users simultaneously, is a key technology for
providing high throughput in future wireless communication systems [1]. In this scenario, the
BS is usually equipped with more antennas than are needed to serve a single user, owing to
equipment size, power supply and computation capacity factors. Consequently, it can
transmit different data streams to multiple users simultaneously in the downlink to exploit the
extra spatial degrees of freedom. A fundamental problem arising in this scenario is how the
BS should choose a subset of users for transmission in order to maximize the total throughput
[2]-[5].

The choice of the best user subset $S_{\text{best}}$ depends on the precoding method adopted at the BS.
Even though dirty paper coding (DPC) [6] is the optimal scheme, in the sense that DPC achieves
the capacity of the MIMO broadcast channel [7]-[10], it is difficult to implement in practical
systems due to its high computational complexity. In this paper we consider a practical
low-complexity scheme termed zero-forcing beamforming (ZFBF) [11]-[15], which completely
removes the interference by inverting the channel matrix at the transmitter. The number of users
that the BS can communicate with simultaneously is at most the number of BS antennas
when ZFBF precoding is adopted.

Determining S best for the multi-user MIMO downlink with ZFBF requires a brute-force 
exhaustive search over all possible user sets, and the complexity of an exhaustive search is 
prohibitive when the number of users is large. Thus, several suboptimal greedy user selection 
algorithms have been designed in the past. Generally, these algorithms fall into two categories:
capacity-based algorithms and Frobenius norm-based algorithms. The capacity-based algorithms,
represented by the zero-forcing with selection (ZFS) algorithm proposed by Dimic et al. [2],
choose users greedily based on the exact sum rate variation. ZFS chooses the first user with the
highest channel capacity and then finds the next user that provides the maximum sum rate from
the remaining unselected users. Based on ZFS, Wang et al. proposed a sequential water-filling
user selection (SWF) algorithm to improve the achieved sum rate performance by eliminating
users allocated zero transmit power after ZFS user selection [5]. The Frobenius norm-based
algorithms, represented by the semi-orthogonal user selection (SUS) algorithm proposed by Yoo
et al. [3], choose users greedily based on the approximate sum rate variation with respect to
channel norm related parameters. In each iteration, SUS adds the new user with the largest
effective channel norm that is nearly orthogonal to the selected users. Along this line, Akhlaghi
et al. proposed a greedy algorithm based on maximizing the determinant of the composite channel
matrix [16], and Jin et al. proposed a capacity-based algorithm maximizing the product of the
diagonal elements of the upper-triangular matrix R after performing QR factorization on the
channel matrix [17]. The Frobenius norm-based algorithms have lower complexity because they
avoid calculating the sum rate, but they pay a price in sum rate performance because they do not
guarantee a positive sum rate increment in the user selection process.

Two main flaws exist in previous greedy search user selection algorithms: 

• Redundant users exist in selected user set; 

• The selected user set might be trapped in a local optimum. 

A 'redundant user' is defined as a user who can be deleted from the selected user set to
yield an increase in the sum rate. The existence of redundant users is an inherent flaw of greedy
incremental algorithms, since the accumulated user selection procedure can make some previously
selected users undesirable. This phenomenon was identified in [2] and [5], which observed that
redundant users exist when some users are assigned zero transmit power after waterfilling power
allocation, and solved it by deleting the users with zero transmit power. However, as we will
prove in Section III, [2] and [5] were incorrect in both identifying and handling redundant users:
redundant users may exist even when all users are allocated positive power, and deleting users
with zero power may not achieve the maximum sum rate increment.

Since user selection is a combinatorial optimization problem, the achieved user set of previous 
algorithms may be trapped in a local optimum where the sum rate cannot be increased by adding 
a new user or deleting a selected user. However, the sum rate can be increased by swapping 
users between the selected user set and the candidate users. After leaving the local optimum by 
a 'swap' operation, the 'add' and 'delete' operation can be utilized further to increase the sum 
rate. 

The main contributions of this paper are as follows: 

1. We propose a user selection algorithm with high throughput and low complexity

In this paper, we propose a new user selection algorithm, named greedy user selection with
swap (GUSS), which introduces 'add', 'delete' and 'swap' operations in the user selection
procedure to increase the sum rate. GUSS eliminates all the redundant users through the 'delete'
operation and escapes from local optima through the 'swap' operation.

2. We present an efficient effective-channel-gain updating strategy to reduce the complexity
of GUSS

To avoid expensive matrix inversion involved in updating the sum rate, we design an efficient 
effective-channel-gain updating method that replaces matrix inversion with less expensive vector- 
vector multiplication. Previous complexity reduction methods, such as those proposed for ZFS 
and SWF, are only suitable for incremental user set update while deleting or swapping users 
cannot be supported. Our method provides the same low complexity for 'add', 'delete' and 
'swap' operations. 

The remainder of this paper is organized as follows. In Section II we describe the system
model and formulate the user selection problem in the multi-user MIMO downlink with ZFBF.
The two flaws in existing user selection algorithms are explored in Section III. In Section IV
the effective-channel-gain updating method for the 'add', 'delete' and 'swap' operations is derived.
In Section V the GUSS algorithm is presented. The sum rate performance and complexity
of GUSS are evaluated and compared with previous user selection algorithms in Section VI.
Section VII concludes the paper.

II. System Model and Problem Formulation 

A. Notation 

We use uppercase boldface letters for matrices and lowercase boldface for vectors. $E\{\cdot\}$
stands for the expectation operator, $\mathbf{H}^*$ ($\mathbf{h}^*$) stands for the conjugate transpose of a matrix $\mathbf{H}$
(vector $\mathbf{h}$), and $|S|$ denotes the cardinality of a user set $S$. $\|\mathbf{h}\|$ denotes the Euclidean vector
norm, with $\|\mathbf{h}\| = \sqrt{\mathbf{h}\mathbf{h}^*}$ when $\mathbf{h}$ is a row vector. $\mathbf{H}^{\dagger}$ denotes the Moore-Penrose pseudo-inverse
$\mathbf{H}^{\dagger} = \mathbf{H}^*(\mathbf{H}\mathbf{H}^*)^{-1}$. $S_1 \setminus S_2$ denotes the set difference that deletes the elements of $S_2$ from $S_1$.

B. System Model 

Consider a single cell MIMO downlink channel with M transmit antennas at the base station 
(BS) serving K single antenna users. Assume a quasi-static flat-fading channel between the BS 
and the users, and $h_{k,m}$ represents the complex channel gain from transmit antenna $m$ to user
$k$. Thus, the received signal $y_k$ at user $k$ is determined by

$$ y_k = \mathbf{h}_k \mathbf{x} + n_k \tag{1} $$

for $k = 1, \cdots, K$, where $\mathbf{x} \in \mathbb{C}^{M \times 1}$ is the transmitted signal vector,
$\mathbf{h}_k = [h_{k,1} \cdots h_{k,M}] \in \mathbb{C}^{1 \times M}$ is the channel vector of user $k$, and $n_k$ is white
Gaussian noise with zero mean and unit variance. $\mathbf{H} = [\mathbf{h}_1^*, \cdots, \mathbf{h}_K^*]^* \in \mathbb{C}^{K \times M}$ is the
channel matrix of all users. The entries of $\mathbf{H}$ are modeled as i.i.d. zero-mean circularly
symmetric complex Gaussian random variables, and the BS is assumed to have full knowledge of
$\mathbf{H}$. The power constraint for the transmitted signal is $E\{\mathbf{x}^*\mathbf{x}\} \le P$. Since the noise has
unit variance, $P$ also represents the total transmit signal-to-noise
ratio (SNR) 0. 

The BS supports up to $M$ users simultaneously when using linear beamforming transmission.
Denote the index set of served users as $S = \{\pi(1), \cdots, \pi(k)\}$, with $k = |S| \le M$ and
$S \subset \{1, \cdots, K\}$. The transmit signal vector $\mathbf{x}$ is a linear combination of all selected
users' data streams, constructed as

$$ \mathbf{x} = \sum_{i \in S} \mathbf{w}_i \sqrt{p_i}\, s_i, \tag{2} $$

where $\mathbf{w}_i \in \mathbb{C}^{M \times 1}$ is the beamforming weight vector, $p_i$ is the transmit power scaling
factor and $s_i$ is the information symbol of user $i$. We can rewrite (1) as

$$ y_k = (\mathbf{h}_k \mathbf{w}_k \sqrt{p_k})\, s_k + \sum_{i \in S,\, i \neq k} (\mathbf{h}_k \mathbf{w}_i \sqrt{p_i})\, s_i + n_k. \tag{3} $$

Finding the optimal beamforming weight vectors $\mathbf{w}_i$ is a difficult non-convex optimization
problem [3]. In this paper we utilize ZFBF, which is easy to implement and has performance
comparable with DPC [3], to determine the beamforming weight vectors.

C. Zero-Forcing Beamforming 

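ZFBF inverts the channel matrix at the transmitter in order to create orthogonal channels
between the BS and users. ZFBF completely removes the interference among different users at
the BS, i.e.,

$$ \mathbf{h}_j \mathbf{w}_i = \delta_{ij}, \quad i, j \in S. \tag{4} $$

Therefore, $\mathbf{w}_i^*$ must lie in the orthogonal complement of the subspace
$V_i = \mathrm{span}\{\mathbf{h}_j \mid j \in S, j \neq i\}$, denoted as $V_i^{\perp}$, where $V_i$ is spanned by the channels
of all the other selected users [18]. The orthogonal projector matrix onto $V_i^{\perp}$ is

$$ \mathbf{P}_i = \mathbf{I}_M - \mathbf{H}_{S \setminus \{i\}}^* (\mathbf{H}_{S \setminus \{i\}} \mathbf{H}_{S \setminus \{i\}}^*)^{-1} \mathbf{H}_{S \setminus \{i\}}, \tag{5} $$

where $\mathbf{I}_M$ is the $M \times M$ identity matrix, and $\mathbf{H}_{S \setminus \{i\}}$ is the row-reduced channel matrix of
all the selected users except user $i$. Supposing $\pi(l) = i$, we have

$$ \mathbf{H}_{S \setminus \{i\}} = [\mathbf{h}_{\pi(1)}^*, \cdots, \mathbf{h}_{\pi(l-1)}^*, \mathbf{h}_{\pi(l+1)}^*, \cdots, \mathbf{h}_{\pi(k)}^*]^*. \tag{6} $$

Since ZFBF is a linear precoder that maximizes the output SNR subject to the constraint that it
does not interfere with any other stream [19], according to the orthogonality condition (4) we have

$$ \mathbf{w}_i = \left( \frac{\mathbf{h}_i \mathbf{P}_i}{\|\mathbf{h}_i \mathbf{P}_i\|^2} \right)^* = \frac{\mathbf{P}_i \mathbf{h}_i^*}{\|\mathbf{h}_i \mathbf{P}_i\|^2}. \tag{7} $$

Define

$$ \boldsymbol{\nu}_i = \mathbf{h}_i \mathbf{P}_i. \tag{8} $$

The vector $\boldsymbol{\nu}_i$ can be interpreted as the effective channel vector (ECV) of user $i$. The ECV
$\boldsymbol{\nu}_i$ is the component of $\mathbf{h}_i$ orthogonal to $V_i$, and the squared modulus of $\boldsymbol{\nu}_i$ equals the
effective-channel-gain $\lambda_i$, as we will show later in (11). Fig. 1 shows an example of the ECVs
for users 1 and 2 when the selected user set is $S = \{1, 2\}$. According to the definition in (8),
we have $\boldsymbol{\nu}_i \mathbf{h}_j^* = 0$ for all $i \neq j$, $i, j \in S$, and $\boldsymbol{\nu}_i$ changes with the selected user set $S$:
its modulus decreases when more users are added to $S$. The beamforming weight vector $\mathbf{w}_i$ can
be rewritten as

$$ \mathbf{w}_i = \frac{\mathbf{P}_i \mathbf{h}_i^*}{\mathbf{h}_i \mathbf{P}_i \mathbf{h}_i^*} = \frac{\boldsymbol{\nu}_i^*}{\|\boldsymbol{\nu}_i\|^2}. \tag{9} $$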

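The received signal for user $i$ is then given by $y_i = \sqrt{p_i}\, s_i + n_i$, and the maximum achievable
ZFBF sum rate for the user set $S$ is the sum of the individual rates

$$ R(S) = \max_{p_i:\ \sum_{i \in S} \lambda_i^{-1} p_i \le P}\ \sum_{i \in S} \log(1 + p_i), \tag{10} $$

where

$$ \lambda_i = \frac{1}{\|\mathbf{w}_i\|^2} = \|\boldsymbol{\nu}_i\|^2 \tag{11} $$

is the effective-channel-gain of user $i$ [3], $\lambda_i^{-1} p_i$ is the transmit power allocated to user $i$, and
$p_i$ is the received SNR of user $i$. By using the Lagrangian method, the optimal $p_i$ in (10) is found
by waterfilling power allocation

$$ p_i = (\mu \lambda_i - 1)^+ = (\mu \|\boldsymbol{\nu}_i\|^2 - 1)^+, \tag{12} $$

where $(x)^+$ denotes $\max\{x, 0\}$, and $\mu$ is the water level satisfying

$$ \sum_{i \in S} (\mu - \|\boldsymbol{\nu}_i\|^{-2})^+ = P. \tag{13} $$

Note that there is another simple explicit formula for the beamforming weight vectors: $\mathbf{w}_{\pi(i)}$
is the $i$-th column of the Moore-Penrose pseudo-inverse of the channel matrix $\mathbf{H}_S$, defined
by $\mathbf{H}_S^{\dagger} = \mathbf{H}_S^* (\mathbf{H}_S \mathbf{H}_S^*)^{-1}$, i.e., $\mathbf{H}_S^{\dagger} = [\mathbf{w}_{\pi(1)}, \cdots, \mathbf{w}_{\pi(k)}]$. According to (9) and (11), we have

$$ \mathbf{H}_S^{\dagger} = \left[ \frac{\boldsymbol{\nu}_{\pi(1)}^*}{\lambda_{\pi(1)}}, \cdots, \frac{\boldsymbol{\nu}_{\pi(k)}^*}{\lambda_{\pi(k)}} \right]. \tag{14} $$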

D. Sum rate maximization with user selection

The sum rate (10) of ZFBF can be further optimized with respect to the selected user set $S$.
Thus, the user selection problem can be formulated as

$$ \begin{aligned} &\text{maximize} && R(S) \\ &\text{subject to} && S \subset \{1, \cdots, K\}. \end{aligned} \tag{15} $$

This is a fundamental question in multi-user MIMO communication, but determining the
optimal $S_{\text{best}}$ in (15) requires an exhaustive search over all possible user sets. The size of the
search space is $\sum_{i=1}^{M} \frac{K!}{i!(K-i)!}$, which increases exponentially with $M$ and is prohibitive
for practical implementation. Many suboptimal user selection strategies have been proposed to
approach the upper bound set by exhaustive search. A major class of ZFBF user selection methods
is the incremental heuristic search method [2]-[5], [16], [17], represented by the ZFS algorithm
proposed in [2].
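The quantities in (10)-(15) can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the logarithm is assumed to be base 2, and the effective-channel-gains are obtained from the identity $\lambda_i = 1/[(\mathbf{H}_S \mathbf{H}_S^*)^{-1}]_{ii}$, which follows from (11) and (14).

```python
import itertools
import numpy as np

def effective_gains(H_S):
    """lambda_i = 1 / [(H_S H_S^*)^{-1}]_{ii}, i.e. ||nu_i||^2 from (11) and (14)."""
    gram_inv = np.linalg.inv(H_S @ H_S.conj().T)
    return 1.0 / np.real(np.diag(gram_inv))

def waterfill(lam, P):
    """Return (mu, p) with p_i = (mu*lam_i - 1)^+ and sum_i p_i/lam_i = P, as in (12)-(13)."""
    inv = np.sort(1.0 / lam)              # ascending values of 1/lambda_i
    for n in range(len(inv), 0, -1):      # try n active users, largest n first
        mu = (P + inv[:n].sum()) / n
        if mu > inv[n - 1]:               # all n active users get positive power
            break
    return mu, np.maximum(mu * lam - 1.0, 0.0)

def sum_rate(H, S, P):
    """ZFBF sum rate R(S) of (10); log base 2 is an assumption of this sketch."""
    if len(S) == 0:
        return 0.0
    lam = effective_gains(H[list(S), :])
    _, p = waterfill(lam, P)
    return float(np.sum(np.log2(1.0 + p)))

def exhaustive_search(H, P):
    """Solve (15) by brute force over all user sets with 1..min(K, M) users."""
    K, M = H.shape
    best_S, best_R = (), 0.0
    for n in range(1, min(K, M) + 1):
        for S in itertools.combinations(range(K), n):
            R = sum_rate(H, S, P)
            if R > best_R:
                best_S, best_R = S, R
    return best_S, best_R
```

The brute-force loop visits every user set counted by the search-space size above, which is why it only serves as an upper-bound reference for small $K$ and $M$.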

III. Flaws in Previous Greedy User Selection Algorithms 

In this section, we study the problems in a typical greedy user selection algorithm, represented
by ZFS. ZFS is initialized with the user with the maximum channel norm. In each iteration, one
user is added to the selected user set such that the sum rate increment is maximized. The 'add'
operation continues until no positive sum rate increment can be achieved. The essential recursive
user set updating step of ZFS is

$$ \pi(n) = \arg\max_{u \in U \setminus S_{n-1}} R(S_{n-1} \cup \{u\}), \qquad S_n = S_{n-1} \cup \{\pi(n)\}, \tag{16} $$

where $U = \{1, \cdots, K\}$ is the index set of all users, $\pi(n)$ is the index of the user selected in the
$n$-th step and $S_n$ is the updated index set after adding the selected user $\pi(n)$. Suppose the output
of the ZFS user selection procedure is $S_{\text{ZFS}}$.
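The recursion (16) can be sketched as below. This is an illustrative version under stated assumptions, not the reference ZFS implementation: it recomputes the sum rate from scratch for every candidate instead of using the recursive matrix update of the original algorithm, the function names are hypothetical, and the logarithm is assumed to be base 2.

```python
import numpy as np

def sum_rate(H, S, P):
    """ZFBF sum rate with waterfilling (log base 2 assumed in this sketch)."""
    if len(S) == 0:
        return 0.0
    H_S = H[list(S), :]
    lam = 1.0 / np.real(np.diag(np.linalg.inv(H_S @ H_S.conj().T)))
    inv = np.sort(1.0 / lam)
    for n in range(len(inv), 0, -1):          # largest feasible number of active users
        mu = (P + inv[:n].sum()) / n
        if mu > inv[n - 1]:
            break
    # 1 + p_i = max(mu * lambda_i, 1) because p_i = (mu * lambda_i - 1)^+
    return float(np.sum(np.log2(np.maximum(mu * lam, 1.0))))

def zfs(H, P):
    """Greedy 'add' loop of (16): stop when no candidate increases the sum rate."""
    K, M = H.shape
    S, R = [], 0.0
    while len(S) < min(K, M):
        candidates = [u for u in range(K) if u not in S]
        rates = [sum_rate(H, S + [u], P) for u in candidates]
        best = int(np.argmax(rates))
        if rates[best] <= R:                  # no positive sum rate increment
            break
        S.append(candidates[best])
        R = rates[best]
    return S, R
```

With a single user the rate is $\log(1 + P\|\mathbf{h}_u\|^2)$, so the first pass of the loop selects the user with the maximum channel norm, matching the initialization described above.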

Let $U_n$ denote the index set that maximizes the sum rate among all user sets with cardinality $n$,
i.e., $U_n = \arg\max_{S \subset U, |S| = n} R(S)$. The essential idea behind (16) is to obtain $U_n$ from
$U_{n-1}$ by adding a new user. However, since $U_n$ may not be a superset of $U_{n-1}$, i.e., $U_{n-1} \not\subset U_n$,
as we will see later in Fig. 2, the $S_n$ selected by ZFS may not be identical to $U_n$ except when
$n = 1$. Furthermore, $S_{\text{ZFS}}$ may not be $S_{\text{best}}$, because the optimum $S_{\text{best}}$ in (15) achieved by
exhaustive search should satisfy $S_{\text{best}} = \arg\max_{U_n,\, 1 \le n \le M} R(U_n)$. The typical flaws in the
ZFS output $S_{\text{ZFS}}$ include the following two aspects.

A. Redundant user 

Because greedy incremental user selection considers only the influence of the users already
selected, and not the influence of users yet to be selected, a previously selected user might
become a redundant user when new users are added. This phenomenon was partially discovered in
[2] and [5], which found that redundant users exist when some users $i \in S$ are
assigned zero transmit power, i.e., $p_i = 0$, after waterfilling power allocation. The redundant
user situation is handled there by deleting the users with $p_i = 0$, and the obtained result is viewed
as the 'optimal beamforming vector' in [5]. However, as we will prove in the following, there is
more to be discovered in both identifying and handling redundant users.
1. Redundant users might exist even if $p_i > 0$ for each selected user

The condition $p_i = 0$ is sufficient but not necessary for user $i \in S$ to be redundant. Its
sufficiency was proved in both [2] and [5], which showed that the sum rate increases after deleting
users with $p_i = 0$. It is, however, not a necessary condition, as demonstrated in the
following.
Let

H = [3 x 3 matrix; entries not recoverable from the source, surviving values: 1, 0.65, 0.46, 1, 0.46] (17)

be a channel matrix instance between a three-antenna BS and three single-antenna users. The
sum rates for user sets {2}, {1, 2}, {1, 3} and {1, 2, 3} under different total transmit SNR $P$ are
shown in Fig. 2.

The user set found by exhaustive search, $S_{\text{best}}$, varies with the transmit SNR $P$: $S_{\text{best}} = \{1, 3\}$
for 0 dB < $P$ < 34.85 dB and $S_{\text{best}} = \{1, 2, 3\}$ for $P$ > 34.85 dB. The user selection procedure
of the ZFS algorithm and $S_{\text{best}}$ at different transmit SNRs are listed in Table I.

According to Table I, the initially selected user {2} is a redundant user for $S_{\text{ZFS}}$ when the
transmit SNR is 27.13 dB < $P$ < 34.85 dB. However, the transmit power of user 2 is not zero. Taking
$P$ = 27.14 dB as an example, the transmit power distribution is $\lambda_1^{-1} p_1 = \lambda_3^{-1} p_3$ = 22.42 dB and
$\lambda_2^{-1} p_2$ = 22.26 dB, indicating that a redundant user exists even if $p_i > 0$ for each selected user.
In fact, as we will show in Section VI-B, the case of redundant users with $p_i = 0$ does not exist
when the ZFS algorithm is utilized to determine the user set.
2. Deleting users with $p_i = 0$ cannot guarantee the maximum sum rate increment

Which user should be deleted when redundant users exist in the selected user set? An intuitive
method is to delete the user with the smallest effective-channel-gain $\lambda_i$, which corresponds to the
user with $p_i = 0$ when a non-positive power allocation exists. However, the sum rate is affected
by the transmit SNR, channel norm, and channel correlation of the selected users, while the
effective-channel-gain $\lambda_i$ only represents part of the influence of channel norm and channel
correlation. We have the following lemma.

Lemma 1: When a redundant user exists in the selected user set, deleting the user with $p_i = 0$
increases the sum rate but cannot guarantee the maximum sum rate increment.

Proof: See the Appendix. □

B. Local optimum: $S_n \neq U_n$

Define the neighborhood of $S_n$ as the sets obtained by adding one user to or deleting one user
from $S_n$. The output of ZFS may fall into a local optimum, i.e., the sum rate of $S_{\text{ZFS}}$ cannot be
increased by adding or deleting one user, but it is still not the global optimum. As shown in
Fig. 2 and Table I, when 10.48 dB < $P$ < 27.13 dB we have $S_{\text{ZFS}} = \{1, 2\}$ and $S_{\text{best}} = \{1, 3\}$. The sum
rate of $S_{\text{ZFS}} = \{1, 2\}$ cannot be increased by adding the new user 3 or by deleting the selected user
1 or 2, but $S_{\text{ZFS}} \neq S_{\text{best}}$. Note, however, that the global optimum $S_{\text{best}}$ can be reached
from $S_{\text{ZFS}}$ by swapping user 2 with user 3.

We can leave the local optimum through a 'swap' operation on the user set $S_{\text{ZFS}}$. However, there
is a tradeoff between complexity and performance in the choice of 'swap' operation. When all
possible swaps are allowed (one-for-one, one-for-many and many-for-one), the complexity is
the same as that of exhaustive search. In this work, for simplicity of implementation, we consider
only the one-for-one swap. Although it cannot guarantee the global optimum, the complexity
is greatly reduced, and we will show later that in most cases the sum rate optimum can
be achieved by using one-for-one swapping.

According to the above analysis, to remove the flaws of the traditional incremental greedy user
selection algorithm we need 'delete' and 'swap' operations on the selected user set. Determining
the best user to 'delete' or the best user pair to 'swap' requires a sum rate comparison among all
possible deleted or swapped user sets. According to (10)-(13), calculating the sum rate involves
a Moore-Penrose pseudo-inverse, which adds a significant amount of complexity. In order to
reduce the algorithm complexity, a recursive update of $(\mathbf{H}_S \mathbf{H}_S^*)^{-1}$ was used in [2] and the LQ
decomposition of $\mathbf{H}_S$ was used in [5] to calculate the effective-channel-gain $\lambda$ and the sum
rate without calculating the Moore-Penrose pseudo-inverse. However, the iterative methods in [2]
and [5] only support adding a new user to the selected user set; they cannot be extended to
calculate the new sum rate when a 'delete' or 'swap' operation is used. We therefore need a new $\lambda$
updating method that can be used to calculate the new sum rate after 'add', 'delete' and 'swap'
operations while maintaining the same level of complexity. A new user selection algorithm will
be constructed by using the new $\lambda$ updating method in Section V.

IV. λ Updating Method Based on ECV

According to (10)-(13), the effective-channel-gain $\lambda$ is the key parameter in calculating the
sum rate of the selected user set $S$. All the previous complexity reduction methods in ZFS and SWF
update $\lambda$ through iteratively updating the pseudo-inverse $\mathbf{H}_S^{\dagger}$ and are only applicable when a
new user is added to $S$. To construct a method suitable for the 'add', 'delete' and 'swap' operations,
we design an efficient $\lambda$ updating strategy that is based on iteratively updating the ECV
$\boldsymbol{\nu}$ defined in (8), instead of $\mathbf{H}_S^{\dagger}$, to reduce the complexity.

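Let $U = \{1, \cdots, K\}$ be the index set of all users and $S$ be the index set of the selected
users. The proposed $\lambda$ updating strategy involves two classes of parameters, which correspond to
the users in $S$ and $U \setminus S$ respectively, as follows:

• The ECV $\boldsymbol{\nu}_i$ of the selected user $i \in S$. According to (5) and (8), we have

$$ \boldsymbol{\nu}_i = \mathbf{h}_i \left( \mathbf{I}_M - \mathbf{H}_{S \setminus \{i\}}^* (\mathbf{H}_{S \setminus \{i\}} \mathbf{H}_{S \setminus \{i\}}^*)^{-1} \mathbf{H}_{S \setminus \{i\}} \right). \tag{18} $$

• The orthogonal component $\mathbf{g}_j$ of the channel vector of the remaining user $j \in U \setminus S$, which is

$$ \mathbf{g}_j = \mathbf{h}_j \left( \mathbf{I}_M - \mathbf{H}_S^* (\mathbf{H}_S \mathbf{H}_S^*)^{-1} \mathbf{H}_S \right). \tag{19} $$

We need to update these two classes of parameters after each 'add', 'delete' and 'swap' operation
to get the new effective-channel-gain $\lambda = \|\boldsymbol{\nu}\|^2$ for the new user set. The updating strategies of
$\boldsymbol{\nu}_i$ and $\mathbf{g}_j$ under the three operations are illustrated from both algebraic and geometric
perspectives in the following.

A. Add a new user

Suppose a new user $k \in U \setminus S$ is added to the selected user set $S$, and denote the new
user set as $S^+ = S \cup \{k\}$. The ECVs $\boldsymbol{\nu}_i$ of the selected users $i \in S$ and the $\mathbf{g}_j$ of the
remaining users $j \in U \setminus S$ are known. We need to calculate the updated $\boldsymbol{\nu}_i^+$ of users $i \in S^+$ and
$\mathbf{g}_j^+$ of users $j \in U \setminus S^+$.
1) Update $\boldsymbol{\nu}_i^+$

Since $S^+ \setminus \{k\} = (S \cup \{k\}) \setminus \{k\} = S$, the ECV of the newly added user $k$ can be calculated
according to (18) and (19) as

$$ \boldsymbol{\nu}_k^+ = \mathbf{h}_k \left( \mathbf{I}_M - \mathbf{H}_S^* (\mathbf{H}_S \mathbf{H}_S^*)^{-1} \mathbf{H}_S \right) = \mathbf{g}_k. \tag{20} $$

As for the other users $i \in S^+ \setminus \{k\}$, i.e., $i \in S$, we have

$$ \mathbf{H}_{S^+ \setminus \{i\}} = [\mathbf{H}_{S \setminus \{i\}}^*, \mathbf{h}_k^*]^*. $$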
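After some algebraic manipulation, we obtain

$$ \begin{aligned} \boldsymbol{\nu}_i^+ &= \mathbf{h}_i \left( \mathbf{I}_M - \mathbf{H}_{S^+ \setminus \{i\}}^* (\mathbf{H}_{S^+ \setminus \{i\}} \mathbf{H}_{S^+ \setminus \{i\}}^*)^{-1} \mathbf{H}_{S^+ \setminus \{i\}} \right) \\ &= \mathbf{h}_i \left( \mathbf{I}_M - \begin{bmatrix} \mathbf{H}_{S \setminus \{i\}}^* & \mathbf{h}_k^* \end{bmatrix} \begin{bmatrix} \mathbf{H}_{S \setminus \{i\}} \mathbf{H}_{S \setminus \{i\}}^* & \mathbf{H}_{S \setminus \{i\}} \mathbf{h}_k^* \\ \mathbf{h}_k \mathbf{H}_{S \setminus \{i\}}^* & \mathbf{h}_k \mathbf{h}_k^* \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{H}_{S \setminus \{i\}} \\ \mathbf{h}_k \end{bmatrix} \right) \\ &= \boldsymbol{\nu}_i \left( \mathbf{I}_M - \frac{\boldsymbol{\nu}_k^*(S^+ \setminus \{i\})\, \boldsymbol{\nu}_k(S^+ \setminus \{i\})}{\|\boldsymbol{\nu}_k(S^+ \setminus \{i\})\|^2} \right), \end{aligned} \tag{21} $$

where $\boldsymbol{\nu}_k(S^+ \setminus \{i\}) = \mathbf{h}_k (\mathbf{I}_M - \mathbf{H}_{S \setminus \{i\}}^* (\mathbf{H}_{S \setminus \{i\}} \mathbf{H}_{S \setminus \{i\}}^*)^{-1} \mathbf{H}_{S \setminus \{i\}})$ is the ECV of user $k$ when the
selected user set is $S^+ \setminus \{i\}$. Since

$$ \begin{aligned} \mathbf{g}_k &= \mathbf{h}_k \left( \mathbf{I}_M - \mathbf{H}_S^* (\mathbf{H}_S \mathbf{H}_S^*)^{-1} \mathbf{H}_S \right) \\ &= \mathbf{h}_k \left( \mathbf{I}_M - \begin{bmatrix} \mathbf{H}_{S \setminus \{i\}}^* & \mathbf{h}_i^* \end{bmatrix} \begin{bmatrix} \mathbf{H}_{S \setminus \{i\}} \mathbf{H}_{S \setminus \{i\}}^* & \mathbf{H}_{S \setminus \{i\}} \mathbf{h}_i^* \\ \mathbf{h}_i \mathbf{H}_{S \setminus \{i\}}^* & \mathbf{h}_i \mathbf{h}_i^* \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{H}_{S \setminus \{i\}} \\ \mathbf{h}_i \end{bmatrix} \right) \\ &= \boldsymbol{\nu}_k(S^+ \setminus \{i\}) - \frac{\mathbf{h}_k \boldsymbol{\nu}_i^*}{\|\boldsymbol{\nu}_i\|^2} \boldsymbol{\nu}_i, \end{aligned} \tag{22} $$

according to (22) we have

$$ \boldsymbol{\nu}_k(S^+ \setminus \{i\}) = \mathbf{g}_k + \frac{\mathbf{h}_k \boldsymbol{\nu}_i^*}{\|\boldsymbol{\nu}_i\|^2} \boldsymbol{\nu}_i. \tag{23} $$

Plugging (23) into (21), we get

$$ \boldsymbol{\nu}_i^+ = \frac{\lambda_i \|\mathbf{g}_k\|^2}{\lambda_i \|\mathbf{g}_k\|^2 + |\boldsymbol{\nu}_i \mathbf{h}_k^*|^2} \left( \boldsymbol{\nu}_i - \frac{\boldsymbol{\nu}_i \mathbf{h}_k^*}{\|\mathbf{g}_k\|^2} \mathbf{g}_k \right). \tag{24} $$

Since $\mathbf{g}_k \perp \boldsymbol{\nu}_i$, the effective-channel-gain is

$$ \lambda_i^+ = \|\boldsymbol{\nu}_i^+\|^2 = \frac{\lambda_i^2 \|\mathbf{g}_k\|^2}{\lambda_i \|\mathbf{g}_k\|^2 + |\boldsymbol{\nu}_i \mathbf{h}_k^*|^2}. \tag{25} $$

As shown in Fig. 3, the derivation of $\boldsymbol{\nu}_i^+$ from (21) to (25) can also be explained from a geometric
perspective. Since $\boldsymbol{\nu}_i^+$ is the component of $\mathbf{h}_i$ orthogonal to the subspace
$V_i^+ = \mathrm{span}\{\mathbf{h}_j \mid j \in S^+, j \neq i\}$, and $\boldsymbol{\nu}_i$ and $\boldsymbol{\nu}_i^+$ are both orthogonal to the subspace
$V_i = \mathrm{span}\{\mathbf{h}_j \mid j \in S, j \neq i\}$, $\boldsymbol{\nu}_i^+$ can be calculated as the component of $\boldsymbol{\nu}_i$ orthogonal to
$\boldsymbol{\nu}_k(S^+ \setminus \{i\})$, which is the projection of $\mathbf{h}_k$ on the subspace $\mathrm{span}\{\boldsymbol{\nu}_i, \boldsymbol{\nu}_i^+\}$ as shown in Fig. 3.
Note that $\mathrm{span}\{\boldsymbol{\nu}_i, \boldsymbol{\nu}_i^+\}$ is a subspace orthogonal to $V_i = \mathrm{span}\{\mathbf{h}_j \mid j \in S, j \neq i\}$; $\boldsymbol{\nu}_i$ and
$\boldsymbol{\nu}_k(S^+ \setminus \{i\})$ are the components of $\mathbf{h}_i$ and $\mathbf{h}_k$ projected onto the subspace $V_i^{\perp}$. Supposing
the angle between $\boldsymbol{\nu}_i$ and $\boldsymbol{\nu}_i^+$ is $\theta$, we have

$$ \cos\theta = \frac{\|\mathbf{g}_k\|}{\|\boldsymbol{\nu}_k(S^+ \setminus \{i\})\|} = \sqrt{\frac{\|\boldsymbol{\nu}_i\|^2 \|\mathbf{g}_k\|^2}{\|\boldsymbol{\nu}_i\|^2 \|\mathbf{g}_k\|^2 + |\boldsymbol{\nu}_i \mathbf{h}_k^*|^2}} \tag{26} $$

$$ \lambda_i^+ = \|\boldsymbol{\nu}_i\|^2 \cos^2\theta = \lambda_i \cos^2\theta \tag{27} $$

$$ \boldsymbol{\nu}_i^+ = \left( \boldsymbol{\nu}_i - \frac{\boldsymbol{\nu}_i \mathbf{h}_k^*}{\|\mathbf{g}_k\|^2} \mathbf{g}_k \right) \cos^2\theta. \tag{28} $$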

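2) Update $\mathbf{g}_j^+$

According to (19), we can calculate the updated $\mathbf{g}_j^+$ with the same method as in (21)-(24) for
the users $j \in U \setminus S^+$. However, since $\mathbf{g}_j^+$ is the component of $\mathbf{h}_j$ orthogonal to the subspace
$V^+ = \mathrm{span}\{\mathbf{h}_i \mid i \in S^+\}$ and $\mathbf{g}_j$ is orthogonal to the subspace $V = \mathrm{span}\{\mathbf{h}_i \mid i \in S\}$, we can find
$\mathbf{g}_j^+$ via the Gram-Schmidt orthogonalization procedure by projecting $\mathbf{g}_j$ onto the orthogonal
complement of the vector $\mathbf{u}$, where $\mathbf{u} \perp V$ and $V^+ = \mathrm{span}\{V, \mathbf{u}\}$. According to the former
analysis, $\mathbf{u} = \boldsymbol{\nu}_k^+ = \mathbf{g}_k$, so

$$ \mathbf{g}_j^+ = \mathbf{g}_j - \frac{\mathbf{g}_j \mathbf{g}_k^*}{\|\mathbf{g}_k\|^2} \mathbf{g}_k \tag{29} $$

for the users $j \in U \setminus S^+$.

In summary, the updated $\boldsymbol{\nu}_i^+$ of users $i \in S^+$ and $\mathbf{g}_j^+$ of users $j \in U \setminus S^+$ are listed as follows:

$$ \boldsymbol{\nu}_i^+ = \begin{cases} \dfrac{\lambda_i \|\mathbf{g}_k\|^2}{\lambda_i \|\mathbf{g}_k\|^2 + |\boldsymbol{\nu}_i \mathbf{h}_k^*|^2} \left( \boldsymbol{\nu}_i - \dfrac{\boldsymbol{\nu}_i \mathbf{h}_k^*}{\|\mathbf{g}_k\|^2} \mathbf{g}_k \right), & i \in S \\ \mathbf{g}_k, & i = k \end{cases} \tag{30} $$

$$ \mathbf{g}_j^+ = \mathbf{g}_j - \frac{\mathbf{g}_j \mathbf{g}_k^*}{\|\mathbf{g}_k\|^2} \mathbf{g}_k, \quad j \in (U \setminus S) \setminus \{k\}. \tag{31} $$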
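As a sanity check, the 'add' updates (30) and (31) can be compared numerically against the direct definitions (18) and (19). The sketch below is illustrative only: the helper names are hypothetical, channel vectors are stored as rows of a NumPy array, and the direct computation uses an explicit matrix inverse purely as a reference.

```python
import numpy as np

def null_projector(A, M):
    """I_M - A^*(A A^*)^{-1} A; the identity when A has no rows."""
    if A.shape[0] == 0:
        return np.eye(M, dtype=complex)
    return np.eye(M) - A.conj().T @ np.linalg.inv(A @ A.conj().T) @ A

def rows(H, idx):
    """Rows of H indexed by idx (works for an empty index set)."""
    return H[np.array(sorted(idx), dtype=int), :]

def direct_params(H, S):
    """ECVs nu_i via (18) and residuals g_j via (19), computed from scratch."""
    K, M = H.shape
    nu = {i: H[i] @ null_projector(rows(H, set(S) - {i}), M) for i in S}
    P_S = null_projector(rows(H, S), M)
    g = {j: H[j] @ P_S for j in range(K) if j not in S}
    return nu, g

def add_user(H, nu, g, k):
    """Apply (30) and (31) when user k joins the selected set."""
    gk = g[k]
    gk2 = np.vdot(gk, gk).real                 # ||g_k||^2
    nu_new = {}
    for i, v in nu.items():
        lam = np.vdot(v, v).real               # lambda_i = ||nu_i||^2
        c = v @ H[k].conj()                    # nu_i h_k^*
        scale = lam * gk2 / (lam * gk2 + abs(c) ** 2)
        nu_new[i] = scale * (v - (c / gk2) * gk)
    nu_new[k] = gk                             # (20): the new user's ECV is g_k
    g_new = {j: gj - ((gj @ gk.conj()) / gk2) * gk
             for j, gj in g.items() if j != k}
    return nu_new, g_new
```

Each update only uses vector-vector products, which is the point of the ECV-based strategy: no matrix inversion is needed once the current $\boldsymbol{\nu}_i$ and $\mathbf{g}_j$ are known.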

B. Delete a selected user 

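Suppose user $k \in S$ is deleted from the selected user set $S$, and denote the new user set
as $S^- = S \setminus \{k\}$. We need to calculate the updated $\boldsymbol{\nu}_i^-$ for users $i \in S^-$ and the updated
$\mathbf{g}_j^-$ for users $j \in U \setminus S^-$.
1) Update $\boldsymbol{\nu}_i^-$

The ECV $\boldsymbol{\nu}_i^-$ is the component of $\mathbf{h}_i$ that is orthogonal to the subspace
$V_i^- = \mathrm{span}\{\mathbf{h}_j \mid j \in S^-, j \neq i\}$. Since $\boldsymbol{\nu}_i \perp V_i$ and $\boldsymbol{\nu}_k \perp V_i^-$, where
$V_i = \mathrm{span}\{\mathbf{h}_j \mid j \in S, j \neq i\} = \mathrm{span}\{V_i^-, \boldsymbol{\nu}_k\}$,
the ECV $\boldsymbol{\nu}_i^-$ can be expressed as the projection of $\mathbf{h}_i$ on the subspace $\mathrm{span}\{\boldsymbol{\nu}_i, \boldsymbol{\nu}_k\}$. This is
equivalent to solving for $\boldsymbol{\nu}_i$ when knowing $\boldsymbol{\nu}_i^+$ and $\boldsymbol{\nu}_k^+$ in Fig. 3, where $\boldsymbol{\nu}_i$ is the projection of $\mathbf{h}_i$ on
the subspace $\mathrm{span}\{\boldsymbol{\nu}_i^+, \boldsymbol{\nu}_k^+\}$. Thus, we have [18]

$$ \begin{aligned} \boldsymbol{\nu}_i^- &= \mathbf{h}_i \begin{bmatrix} \boldsymbol{\nu}_i^* & \boldsymbol{\nu}_k^* \end{bmatrix} \begin{bmatrix} \boldsymbol{\nu}_i \boldsymbol{\nu}_i^* & \boldsymbol{\nu}_i \boldsymbol{\nu}_k^* \\ \boldsymbol{\nu}_k \boldsymbol{\nu}_i^* & \boldsymbol{\nu}_k \boldsymbol{\nu}_k^* \end{bmatrix}^{-1} \begin{bmatrix} \boldsymbol{\nu}_i \\ \boldsymbol{\nu}_k \end{bmatrix} \\ &= \begin{bmatrix} \mathbf{h}_i \boldsymbol{\nu}_i^* & 0 \end{bmatrix} \begin{bmatrix} \|\boldsymbol{\nu}_i\|^2 & \boldsymbol{\nu}_i \boldsymbol{\nu}_k^* \\ \boldsymbol{\nu}_k \boldsymbol{\nu}_i^* & \|\boldsymbol{\nu}_k\|^2 \end{bmatrix}^{-1} \begin{bmatrix} \boldsymbol{\nu}_i \\ \boldsymbol{\nu}_k \end{bmatrix} \\ &= \frac{\begin{bmatrix} \|\boldsymbol{\nu}_i\|^2 & 0 \end{bmatrix}}{\|\boldsymbol{\nu}_i\|^2 \|\boldsymbol{\nu}_k\|^2 - |\boldsymbol{\nu}_i \boldsymbol{\nu}_k^*|^2} \begin{bmatrix} \|\boldsymbol{\nu}_k\|^2 & -\boldsymbol{\nu}_i \boldsymbol{\nu}_k^* \\ -\boldsymbol{\nu}_k \boldsymbol{\nu}_i^* & \|\boldsymbol{\nu}_i\|^2 \end{bmatrix} \begin{bmatrix} \boldsymbol{\nu}_i \\ \boldsymbol{\nu}_k \end{bmatrix} \\ &= \frac{\|\boldsymbol{\nu}_i\|^2 \|\boldsymbol{\nu}_k\|^2}{\|\boldsymbol{\nu}_i\|^2 \|\boldsymbol{\nu}_k\|^2 - |\boldsymbol{\nu}_i \boldsymbol{\nu}_k^*|^2} \left( \boldsymbol{\nu}_i - \frac{\boldsymbol{\nu}_i \boldsymbol{\nu}_k^*}{\|\boldsymbol{\nu}_k\|^2} \boldsymbol{\nu}_k \right). \end{aligned} \tag{32} $$

The second equality holds because $\boldsymbol{\nu}_k \perp V_k$, where $V_k = \mathrm{span}\{\mathbf{h}_j \mid j \in S, j \neq k\}$, and thus $\mathbf{h}_i \boldsymbol{\nu}_k^* = 0$.
The third equality holds because $\mathbf{h}_i \boldsymbol{\nu}_i^* = \mathbf{h}_i (\mathbf{P}_i^{\perp})^* \mathbf{h}_i^* = \mathbf{h}_i \mathbf{P}_i^{\perp} (\mathbf{P}_i^{\perp})^* \mathbf{h}_i^* = \|\boldsymbol{\nu}_i\|^2$, where
$\mathbf{P}_i^{\perp} = \mathbf{I}_M - \mathbf{H}_{S \setminus \{i\}}^* (\mathbf{H}_{S \setminus \{i\}} \mathbf{H}_{S \setminus \{i\}}^*)^{-1} \mathbf{H}_{S \setminus \{i\}}$ is an idempotent Hermitian matrix, i.e.,
$(\mathbf{P}_i^{\perp})^2 = \mathbf{P}_i^{\perp}$ and $(\mathbf{P}_i^{\perp})^* = \mathbf{P}_i^{\perp}$.

According to (32), the effective-channel-gain $\lambda_i^-$ for user $i$ is

$$ \lambda_i^- = \|\boldsymbol{\nu}_i^-\|^2 = \frac{\|\boldsymbol{\nu}_i\|^4 \|\boldsymbol{\nu}_k\|^2}{\|\boldsymbol{\nu}_i\|^2 \|\boldsymbol{\nu}_k\|^2 - |\boldsymbol{\nu}_i \boldsymbol{\nu}_k^*|^2}. \tag{33} $$

The above derivation of $\boldsymbol{\nu}_i^-$ can also be explained from a geometric perspective, as shown in
Fig. 3. The vector $\boldsymbol{\nu}_i$ lies in the subspace $\mathrm{span}\{\boldsymbol{\nu}_i^-, \boldsymbol{\nu}_k\}$ and is orthogonal to $\boldsymbol{\nu}_k$. Supposing the
angle between $\boldsymbol{\nu}_i^-$ and $\boldsymbol{\nu}_i$ is $\theta$, we have

$$ \cos\theta = \sqrt{1 - \sin^2\theta} = \sqrt{1 - \frac{|\boldsymbol{\nu}_i \boldsymbol{\nu}_k^*|^2}{\|\boldsymbol{\nu}_i\|^2 \|\boldsymbol{\nu}_k\|^2}} \tag{34} $$

$$ \lambda_i^- = \|\boldsymbol{\nu}_i\|^2 \cos^{-2}\theta = \lambda_i \cos^{-2}\theta \tag{35} $$

$$ \boldsymbol{\nu}_i^- = \left( \boldsymbol{\nu}_i - \frac{\boldsymbol{\nu}_i \boldsymbol{\nu}_k^*}{\|\boldsymbol{\nu}_k\|^2} \boldsymbol{\nu}_k \right) \cos^{-2}\theta. \tag{36} $$

2) Update $\mathbf{g}_j^-$

The deleted user $k$ is now moved from the previously selected user set $S$ to the remaining
user set $U \setminus S^-$. Since $S^- = S \setminus \{k\}$, $\mathbf{g}_k^-$ can be calculated according to (18) and (19) as

$$ \begin{aligned} \mathbf{g}_k^- &= \mathbf{h}_k \left( \mathbf{I}_M - \mathbf{H}_{S^-}^* (\mathbf{H}_{S^-} \mathbf{H}_{S^-}^*)^{-1} \mathbf{H}_{S^-} \right) \\ &= \mathbf{h}_k \left( \mathbf{I}_M - \mathbf{H}_{S \setminus \{k\}}^* (\mathbf{H}_{S \setminus \{k\}} \mathbf{H}_{S \setminus \{k\}}^*)^{-1} \mathbf{H}_{S \setminus \{k\}} \right) = \boldsymbol{\nu}_k. \end{aligned} \tag{37} $$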
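As for the other users $j \in (U \setminus S^-) \setminus \{k\}$, i.e., $j \in U \setminus S$, we can update $\mathbf{g}_j$ according to the
Gram-Schmidt orthogonalization procedure. Since $\mathbf{g}_j^-$ is the component of $\mathbf{h}_j$ orthogonal to the
subspace $V^- = \mathrm{span}\{\mathbf{h}_i \mid i \in S^-\}$, $\mathbf{g}_j \perp V$ and $\boldsymbol{\nu}_k \perp V^-$, where
$V = \mathrm{span}\{\mathbf{h}_i \mid i \in S\} = \mathrm{span}\{V^-, \boldsymbol{\nu}_k\}$,
the updated $\mathbf{g}_j^-$ can be expressed as the combination of $\mathbf{g}_j$ and the projection of $\mathbf{h}_j$ on $\boldsymbol{\nu}_k$, i.e.,

$$ \mathbf{g}_j^- = \mathbf{g}_j + \frac{\mathbf{h}_j \boldsymbol{\nu}_k^*}{\|\boldsymbol{\nu}_k\|^2} \boldsymbol{\nu}_k \tag{38} $$

for the users $j \in U \setminus S$.

In summary, the updated $\boldsymbol{\nu}_i^-$ of users $i \in S^-$ and $\mathbf{g}_j^-$ of users $j \in U \setminus S^-$ are listed as
follows:

$$ \boldsymbol{\nu}_i^- = \frac{\lambda_i \lambda_k}{\lambda_i \lambda_k - |\boldsymbol{\nu}_i \boldsymbol{\nu}_k^*|^2} \left( \boldsymbol{\nu}_i - \frac{\boldsymbol{\nu}_i \boldsymbol{\nu}_k^*}{\lambda_k} \boldsymbol{\nu}_k \right), \quad i \in S^- \tag{39} $$

$$ \mathbf{g}_j^- = \begin{cases} \mathbf{g}_j + \dfrac{\mathbf{h}_j \boldsymbol{\nu}_k^*}{\lambda_k} \boldsymbol{\nu}_k, & j \in U \setminus S \\ \boldsymbol{\nu}_k, & j = k. \end{cases} \tag{40} $$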
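The 'delete' updates (39) and (40) can be checked in the same way against the direct definitions (18) and (19). As before, this is a hedged sketch with hypothetical helper names; the direct computation with explicit inverses is only used as a reference and is not part of the proposed method.

```python
import numpy as np

def null_projector(A, M):
    """I_M - A^*(A A^*)^{-1} A; the identity when A has no rows."""
    if A.shape[0] == 0:
        return np.eye(M, dtype=complex)
    return np.eye(M) - A.conj().T @ np.linalg.inv(A @ A.conj().T) @ A

def rows(H, idx):
    """Rows of H indexed by idx (works for an empty index set)."""
    return H[np.array(sorted(idx), dtype=int), :]

def direct_params(H, S):
    """ECVs nu_i via (18) and residuals g_j via (19), computed from scratch."""
    K, M = H.shape
    nu = {i: H[i] @ null_projector(rows(H, set(S) - {i}), M) for i in S}
    P_S = null_projector(rows(H, S), M)
    g = {j: H[j] @ P_S for j in range(K) if j not in S}
    return nu, g

def delete_user(H, nu, g, k):
    """Apply (39) and (40) when user k leaves the selected set."""
    vk = nu[k]
    lam_k = np.vdot(vk, vk).real               # lambda_k = ||nu_k||^2
    nu_new = {}
    for i, v in nu.items():
        if i == k:
            continue
        lam_i = np.vdot(v, v).real
        c = v @ vk.conj()                      # nu_i nu_k^*
        scale = lam_i * lam_k / (lam_i * lam_k - abs(c) ** 2)
        nu_new[i] = scale * (v - (c / lam_k) * vk)
    g_new = {j: gj + ((H[j] @ vk.conj()) / lam_k) * vk for j, gj in g.items()}
    g_new[k] = vk                              # (37): the deleted user's residual is nu_k
    return nu_new, g_new
```

Applying `add_user`-style and `delete_user`-style updates in sequence gives the one-for-one swap, since a swap is the composition of one 'add' and one 'delete'.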



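C. Swap users one-for-one

Suppose a new user $l \in U \setminus S$ is swapped with a selected user $k \in S$, and denote the new user
set as $S^s = (S \cup \{l\}) \setminus \{k\}$. We need to calculate the updated $\boldsymbol{\nu}_i^s$ for users $i \in S^s$
and the updated $\mathbf{g}_j^s$ for users $j \in U \setminus S^s$.

Since the one-for-one user swap is a combination of adding a new user and deleting a selected
user, the corresponding $\boldsymbol{\nu}_i^s$ and $\mathbf{g}_j^s$ updating algorithm can be obtained by sequentially applying
the 'add' and 'delete' updating algorithms, as defined in (30), (31) and (39), (40). Assume that
user $l$ is added first and user $k$ is then deleted. Denoting the intermediate results as
$\boldsymbol{\nu}_{i+}$ and $\mathbf{g}_{j+}$, we have

$$ \boldsymbol{\nu}_i^s = \frac{\|\boldsymbol{\nu}_{i+}\|^2 \|\boldsymbol{\nu}_{k+}\|^2}{\|\boldsymbol{\nu}_{i+}\|^2 \|\boldsymbol{\nu}_{k+}\|^2 - |\boldsymbol{\nu}_{i+} \boldsymbol{\nu}_{k+}^*|^2} \left( \boldsymbol{\nu}_{i+} - \frac{\boldsymbol{\nu}_{i+} \boldsymbol{\nu}_{k+}^*}{\|\boldsymbol{\nu}_{k+}\|^2} \boldsymbol{\nu}_{k+} \right), \quad i \in S^s \tag{41} $$

$$ \lambda_i^s = \frac{\|\boldsymbol{\nu}_{i+}\|^4 \|\boldsymbol{\nu}_{k+}\|^2}{\|\boldsymbol{\nu}_{i+}\|^2 \|\boldsymbol{\nu}_{k+}\|^2 - |\boldsymbol{\nu}_{i+} \boldsymbol{\nu}_{k+}^*|^2}, \quad i \in S^s \tag{42} $$

$$ \mathbf{g}_j^s = \begin{cases} \mathbf{g}_{j+} + \dfrac{\mathbf{h}_j \boldsymbol{\nu}_{k+}^*}{\|\boldsymbol{\nu}_{k+}\|^2} \boldsymbol{\nu}_{k+}, & j \in (U \setminus S) \setminus \{l\} \\ \boldsymbol{\nu}_{k+}, & j = k \end{cases} \tag{43} $$

where

$$ \boldsymbol{\nu}_{i+} = \begin{cases} \dfrac{\lambda_i \|\mathbf{g}_l\|^2}{\lambda_i \|\mathbf{g}_l\|^2 + |\boldsymbol{\nu}_i \mathbf{h}_l^*|^2} \left( \boldsymbol{\nu}_i - \dfrac{\boldsymbol{\nu}_i \mathbf{h}_l^*}{\|\mathbf{g}_l\|^2} \mathbf{g}_l \right), & i \in S \\ \mathbf{g}_l, & i = l \end{cases} \tag{44} $$

$$ \mathbf{g}_{j+} = \mathbf{g}_j - \frac{\mathbf{g}_j \mathbf{g}_l^*}{\|\mathbf{g}_l\|^2} \mathbf{g}_l, \quad j \in (U \setminus S) \setminus \{l\}. \tag{45} $$

Note that we can also get the same $\boldsymbol{\nu}_i^s$ and $\mathbf{g}_j^s$ by first deleting user $k$ and then adding user $l$. The
expressions are similar to (41)-(43), have the same complexity, and are thus omitted for the sake of
space.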

V. GUSS Algorithm 

A new greedy user selection algorithm, which utilizes the ECV-based $\lambda$ updating strategy in
Section IV, is proposed in this section. The algorithm is called the greedy user selection with swap
(GUSS) algorithm, as it includes 'add', 'delete' and 'swap' operations.

The GUSS algorithm works as follows. It initializes with ZFS, i.e., it adds one user with the
maximal $\Delta R$ in each step consecutively until the maximal $\Delta R \le 0$. It then deletes one user
at a time, each deletion producing the maximal $\Delta R$, until no sum rate increment is possible. GUSS
oscillates between 'sequential add' and 'sequential delete' until $\Delta R \le 0$ for both operations. One
'swap' operation is then invoked to boost the sum rate. After the 'swap', GUSS goes back to
the oscillation of 'add' and 'delete', attempting to further increase the sum rate. If $\Delta R \le 0$ for
any user choice, the user selection procedure finishes. The construction and complexity analysis
of the GUSS algorithm are outlined next.



A. Construction of GUSS algorithm 

Let $U = \{1, \cdots, K\}$ be the index set of all users and $S$ be the index set of the selected
users. $\boldsymbol{\nu}_i$ and $\lambda_i$ are the ECV and effective-channel-gain of the selected user $i \in S$, and
$\mathbf{g}_j$ for $j \in U \setminus S$ is the component of the remaining channel vector orthogonal to the subspace
$\mathrm{span}\{\mathbf{h}_i \mid i \in S\}$.
Step 1) Initialization:

$S = \emptyset$, $\mathbf{g}_j = \mathbf{h}_j$ for all users $j \in U$.
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Step 2) Add a new user: 



A+( w ) = { A " l|gm|1 +ll"* h lll*' " (46) 
||g™|| 2 , « = w 

= arg max R(S U {w}) . (47) 

Let Ai? = R(S U {&}) - R(S). If Ai? > 0, S <- S U {A;}, update g 3 and corresponding A, 
according to (l30l)(l3"TI) and then go to step 2); if AR < for one iteration, go to step 3); else if 
AR < for two consecutive iterations, go to step 4). 
Step 3) Delete a selected user: 

KH= : : ,„2 . iG^\W (48) 

A,A„, - || 

fc = argmaxi?(S'\ {w}) . (49) 

Let Ai? = R(S \ {k}) - R(S). If Ai? > 0, S <- S \ {k}, update u h gj and corresponding A* 
according to (|39l(|40l ) and then go to step 3); if AR < for one iteration, go to step 2); else if 
AR < for two consecutive iterations, go to step 4). 
Step 4) Swap users one-for-one: 



I II' 1 II II 2 

Xi(k,l) = 2 r+J ™" s- / 6 .S'U {/} \ {/>} (50) 

11^-^,11 11^,11 — v 

II •+>' 1 1 II + 1 1 ! +. ; fc+,! 

i/., . = ^ ^llaf+ll^hfH V, ||s, || 7 (51) 



i = z 



{M} = arg max . U {/} \ {fc}) . (52) 

kes, leu\S 

Let A_R = R(S U {/} \ {k}) - R(S). If Ai? > 0, S <- S U {1} \ {k}, update v u gj and 
corresponding Aj according to (l4TI)-(|43l and then go to step 2); if AR < 0, go to step 5). 
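Step 5) Precoding matrix:

$$W = \left[ \frac{\sqrt{\mu \lambda_{(1)} - 1}}{\lambda_{(1)}} \nu_{(1)}^*, \; \frac{\sqrt{\mu \lambda_{(2)} - 1}}{\lambda_{(2)}} \nu_{(2)}^*, \; \cdots, \; \frac{\sqrt{\mu \lambda_{(n)} - 1}}{\lambda_{(n)}} \nu_{(n)}^* \right], \tag{53}$$

where $n = |S|$, $\nu_{(i)}$ and $\lambda_{(i)}$ are the ECV and effective-channel-gain of the $i$-th user in $S$, and $\mu = \left(P + \sum_{i \in S} \lambda_i^{-1}\right)/n$ is the water level for power allocation.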
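The control flow of steps 1) to 5) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it recomputes every effective-channel-gain from scratch by Gram-Schmidt projection instead of using the incremental updates, it assumes rates are measured with a base-2 logarithm, and it only adds users while $|S| < M$. All function names (`inner`, `residual`, `sum_rate`, `zfs`, `guss`) and the demo channel `H_demo` are hypothetical.

```python
import math

def inner(a, b):
    """Row-vector product a b^* (the second argument is conjugated)."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

def strip(v, basis):
    """Remove from v its components along the orthonormal vectors in basis."""
    for q in basis:
        c = inner(v, q)
        v = [x - c * y for x, y in zip(v, q)]
    return v

def residual(h, others):
    """Component of h orthogonal to span(others), via Gram-Schmidt."""
    basis = []
    for v in others:
        v = strip(v, basis)
        n = math.sqrt(inner(v, v).real)
        if n > 1e-9:
            basis.append([x / n for x in v])
    return strip(h, basis)

def sum_rate(H, S, P):
    """R(S) with ZFBF and waterfilling (log base 2 is an assumption)."""
    gains = []
    for i in S:
        nu = residual(H[i], [H[j] for j in S if j != i])  # ECV of user i
        gains.append(max(inner(nu, nu).real, 1e-300))     # effective-channel-gain
    gains.sort(reverse=True)
    for m in range(len(gains), 0, -1):  # drop weakest users until all powers are positive
        mu = (P + sum(1.0 / g for g in gains[:m])) / m
        if mu * gains[m - 1] > 1.0:
            return sum(math.log2(mu * g) for g in gains[:m])
    return 0.0

def zfs(H, P, eps=1e-12):
    """Greedy 'add'-only selection: the part of GUSS before its first 'delete'."""
    S = []
    while len(S) < len(H[0]):
        rest = [w for w in range(len(H)) if w not in S]
        if not rest:
            break
        best = max((S + [w] for w in rest), key=lambda c: sum_rate(H, c, P))
        if sum_rate(H, best, P) <= sum_rate(H, S, P) + eps:
            break
        S = best
    return S

def guss(H, P, eps=1e-12):
    """Add / delete / swap state machine following steps 2) to 4)."""
    K, M = len(H), len(H[0])
    S, step, fails = [], "add", 0
    while True:
        rest = [w for w in range(K) if w not in S]
        if step == "add":
            cands = [S + [w] for w in rest] if len(S) < M else []
        elif step == "delete":
            cands = [[i for i in S if i != k] for k in S]
        else:  # one-for-one swap of k in S with l outside S
            cands = [[l if i == k else i for i in S] for k in S for l in rest]
        best = max(cands, key=lambda c: sum_rate(H, c, P), default=S)
        if sum_rate(H, best, P) > sum_rate(H, S, P) + eps:
            S, fails = best, 0
            if step == "swap":
                step = "add"          # a successful swap returns to step 2)
        elif step == "swap":
            return S                  # no add, delete or swap helps: go to step 5)
        else:
            fails += 1                # consecutive failed add/delete attempts
            step = "swap" if fails >= 2 else ("delete" if step == "add" else "add")

# Hypothetical 5-user, 3-antenna channel used only to exercise the sketch.
H_demo = [[1 + 0.5j, 0.2, -0.3j], [0.4, 0.9 - 0.2j, 0.1], [0.3j, -0.5, 1.1],
          [0.8, 0.7, 0.1j], [-0.2, 0.3 + 0.3j, 0.6]]
S_zfs, S_guss = zfs(H_demo, 10.0), guss(H_demo, 10.0)
```

Because `guss` passes through the same greedy additions as `zfs` and afterwards only accepts moves that strictly increase the rate, its output can never have a lower sum rate than the ZFS set on the same channel.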

GUSS initializes with the empty user set $S = \emptyset$. The first selected user is the one with the maximal effective-channel-gain $\lambda_w^{+}(w)$, which is equivalent to the maximal squared channel norm $\|h_w\|^2$ for $S = \emptyset$. GUSS repeats the add operation in step 2) sequentially, and the procedure before it goes to step 3) for the first time constitutes the user selection of the ZFS algorithm.

For each 'add', 'delete' and 'swap' operation in step 2) to step 4), the updated effective-channel-gains are calculated first and then used to evaluate the updated sum rate with waterfilling power allocation. To further reduce the complexity, we can eliminate the iterative waterfilling procedure involved in (47), (49) and (52) by restricting the candidate user or user pair to the ones that provide positive transmit power for all users in the updated user set. Take (47) in step 2) as an example; from the properties of waterfilling, this holds if

$$\frac{|S| + 1}{\min_{i \in S \cup \{w\}} \lambda_i^{+}(w)} < P + \sum_{i \in S \cup \{w\}} \frac{1}{\lambda_i^{+}(w)}. \tag{54}$$

If (54) is satisfied, the corresponding water level can be calculated directly through

$$\mu = \frac{1}{|S| + 1} \left( P + \sum_{i \in S \cup \{w\}} \frac{1}{\lambda_i^{+}(w)} \right). \tag{55}$$

Similar inequalities can be obtained for steps 3) and 4). According to our simulations, this search space pruning does not compromise the sum rate at all.
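The pruning test can be illustrated with a short sketch. It assumes the updated gains of all users in the candidate set are already available as a list; the function name `direct_water_level` and the numbers in the example are hypothetical.

```python
def direct_water_level(gains, P):
    """Return the water level if every user gets positive power, else None.

    gains: updated effective-channel-gains of all users in the candidate set.
    P:     total transmit power (linear scale).
    """
    n = len(gains)
    budget = P + sum(1.0 / g for g in gains)
    if n / min(gains) < budget:   # weakest user still sits below the water level
        return budget / n         # closed-form water level, no iteration needed
    return None                   # candidate is pruned from the search

# Example values (hypothetical): the first set passes the test, the second is pruned.
mu_ok = direct_water_level([4.0, 2.0], 1.0)
mu_pruned = direct_water_level([4.0, 0.5], 1.0)
powers_ok = [mu_ok - 1.0 / g for g in [4.0, 2.0]]
```

In the first example the per-user powers $\mu - 1/\lambda_i$ are 0.625 and 0.375, which add up to the power budget $P = 1$.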

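The search space in step 4) can be further reduced by $K - |S|$ or $|S|$ candidates, because a one-for-one swap that involves the last added or deleted user provides a non-positive sum rate increment. The calculation of all $\lambda_i^{\pm}(k,l)$ in (50) involves $K - |S|$ partial $\nu_{i,l+}$ updates as in (51) if user $l$ is added first. If we calculate $\lambda_i^{\pm}(k,l)$ by first deleting user $k$, the update involves $|S|$ partial $\nu_{i,k-}$ and $g_{j,k-}$ updates as

$$\lambda_i^{\pm}(k,l) = \begin{cases} \dfrac{\|\nu_{i,k-}\|^4 \|g_{l,k-}\|^2}{\|\nu_{i,k-}\|^2 \|g_{l,k-}\|^2 + \|\nu_{i,k-} h_l^*\|^2}, & i \in S \setminus \{k\} \\ \|g_{l,k-}\|^2, & i = l \end{cases} \tag{56}$$

$$\nu_{i,k-} = \frac{\lambda_i \left( \lambda_k \nu_i - (\nu_i \nu_k^*)\, \nu_k \right)}{\lambda_i \lambda_k - \|\nu_i \nu_k^*\|^2}, \quad i \in S \setminus \{k\} \tag{57}$$

$$g_{j,k-} = g_j + \frac{h_j \nu_k^*}{\lambda_k} \nu_k, \quad j \in \mathcal{U} \setminus S. \tag{58}$$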
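The closed-form gain updates can be checked numerically against a direct recomputation. The sketch below assumes the 'add' update takes the form $\lambda_i^2\|g_w\|^2 / (\lambda_i\|g_w\|^2 + \|\nu_i h_w^*\|^2)$, the 'delete' update the form $\lambda_i^2\lambda_w / (\lambda_i\lambda_w - \|\nu_i\nu_w^*\|^2)$, and the residual update the form $g_j + (h_j\nu_k^*/\lambda_k)\,\nu_k$. The helper names and the 4-user channel `H` are hypothetical.

```python
import math

def inner(a, b):
    """Row-vector product a b^* (the second argument is conjugated)."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

def strip(v, basis):
    for q in basis:
        c = inner(v, q)
        v = [x - c * y for x, y in zip(v, q)]
    return v

def residual(h, others):
    """Component of h orthogonal to span(others), via Gram-Schmidt."""
    basis = []
    for v in others:
        v = strip(v, basis)
        n = math.sqrt(inner(v, v).real)
        if n > 1e-9:
            basis.append([x / n for x in v])
    return strip(h, basis)

def norm2(v):
    return inner(v, v).real

def ecv(H, S, i):
    """Effective channel vector of user i within the selected set S."""
    return residual(H[i], [H[j] for j in S if j != i])

def add_gain(lam_i, nu_i, g_w, h_w):
    """Gain of selected user i after adding user w (assumed closed form)."""
    gw2 = norm2(g_w)
    return lam_i ** 2 * gw2 / (lam_i * gw2 + abs(inner(nu_i, h_w)) ** 2)

def delete_gain(lam_i, nu_i, lam_w, nu_w):
    """Gain of user i after deleting user w (assumed closed form)."""
    return lam_i ** 2 * lam_w / (lam_i * lam_w - abs(inner(nu_i, nu_w)) ** 2)

def g_after_delete(g_j, h_j, nu_k, lam_k):
    """Residual of unselected user j after user k leaves the selected set."""
    c = inner(h_j, nu_k) / lam_k
    return [x + c * y for x, y in zip(g_j, nu_k)]

# Hypothetical channel: 4 users, 4 antennas, rows are linearly independent.
H = [[1 + 1j, 0.5, -0.2j, 0.1], [0.3, 1 - 0.5j, 0.4, -0.2],
     [0.2 + 0.1j, -0.3, 1 + 0.2j, 0.3j], [0.1, 0.2j, -0.4, 1 - 1j]]

# 'Add' user 2 to S = {0, 1}: closed form versus direct recomputation.
g_w = residual(H[2], [H[0], H[1]])
add_err = max(abs(add_gain(norm2(ecv(H, [0, 1], i)), ecv(H, [0, 1], i), g_w, H[2])
                  - norm2(ecv(H, [0, 1, 2], i))) for i in (0, 1))

# 'Delete' user 2 from S = {0, 1, 2}: closed form versus direct recomputation.
nu_k = ecv(H, [0, 1, 2], 2)
del_err = max(abs(delete_gain(norm2(ecv(H, [0, 1, 2], i)), ecv(H, [0, 1, 2], i),
                              norm2(nu_k), nu_k)
                  - norm2(ecv(H, [0, 1], i))) for i in (0, 1))

# Residual of unselected user 3 after deleting user 2.
g3 = residual(H[3], [H[0], H[1], H[2]])
g_err = max(abs(x - y) for x, y in zip(g_after_delete(g3, H[3], nu_k, norm2(nu_k)),
                                       residual(H[3], [H[0], H[1]])))
```

All three errors should be at the level of floating-point rounding, which is what makes a constant-cost update per candidate possible in place of a fresh projection.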

In step 5), the precoding matrix in (53) is the result of ZFBF and waterfilling power allocation.
According to ©, the precoding matrix can be written in the form 

[ w (d^ ••• > \)V^1' (59) 
The transmit power scaling factor is

$$p_{(i)} = \mu - \frac{1}{\lambda_{(i)}} > 0 \tag{60}$$

for all users because all the users selected by GUSS are allocated positive transmit power. If not, the sum rate could be increased by a 'delete' operation, which contradicts the fact that the sum rate of the user set $S_{GUSS}$ output by GUSS cannot be increased by an 'add', 'delete' or 'one-for-one swap' operation. By plugging © and (60) into (59), we obtain the precoding matrix (53).
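A small numerical sketch can illustrate what such a precoder does. It assumes each column is the conjugated ECV scaled by $\sqrt{p_i/\lambda_i}$ with $p_i = \mu - 1/\lambda_i$, which is algebraically the same as the scaling $\sqrt{\mu\lambda_i - 1}/\lambda_i$. The function names and the 2-user channel are hypothetical.

```python
import math

def inner(a, b):
    """Row-vector product a b^* (the second argument is conjugated)."""
    return sum(x * y.conjugate() for x, y in zip(a, b))

def strip(v, basis):
    for q in basis:
        c = inner(v, q)
        v = [x - c * y for x, y in zip(v, q)]
    return v

def residual(h, others):
    """Component of h orthogonal to span(others), via Gram-Schmidt."""
    basis = []
    for v in others:
        v = strip(v, basis)
        n = math.sqrt(inner(v, v).real)
        if n > 1e-9:
            basis.append([x / n for x in v])
    return strip(h, basis)

def zf_precoder(H, S, P):
    """Columns of W for the users in S; assumes all users get positive power."""
    nus = [residual(H[i], [H[j] for j in S if j != i]) for i in S]
    lams = [inner(v, v).real for v in nus]
    mu = (P + sum(1.0 / l for l in lams)) / len(S)    # water level
    cols = []
    for nu, lam in zip(nus, lams):
        p = mu - 1.0 / lam                            # power of this user
        if p <= 0:
            raise ValueError("a selected user would get non-positive power")
        scale = math.sqrt(p / lam)
        cols.append([scale * x.conjugate() for x in nu])
    return cols, mu, lams

def effective(H, cols, j, i):
    """Scalar channel seen by user j from the beam intended for user i."""
    return sum(a * b for a, b in zip(H[j], cols[i]))

# Hypothetical example: two users, three antennas, P = 10 (linear scale).
H = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
cols, mu, lams = zf_precoder(H, [0, 1], 10.0)
total_power = sum(abs(x) ** 2 for c in cols for x in c)
```

For this example the gains are 0.5 and 1, the water level is 6.5, the cross terms `effective(H, cols, 0, 1)` and `effective(H, cols, 1, 0)` vanish, and the total transmit power equals the budget of 10.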

By construction, GUSS provides a sum rate higher than or equal to the one achieved by ZFS because the selected user set $S$ is improved by allowing 'delete' and 'swap' operations on top of ZFS. To distinguish the sources of the performance improvement, we construct another user selection algorithm that only allows 'add' and 'delete' operations, named the greedy user selection without swap (GUS-nS) algorithm. GUS-nS removes the swap operation in step 4) of GUSS; therefore, the user selection process finishes if $\Delta R \le 0$ for two consecutive iterations in step 2) or step 3). GUS-nS thus improves ZFS only by eliminating the redundant users, without handling the local optimum flaw.

B. Complexity analysis 

The computational complexity of the proposed algorithm includes two parts: 1) user search; and 2) the $\nu_i$, $g_j$ and $\lambda_i$ update. We focus on the complexity of the user search, as the $\nu_i$, $g_j$ and $\lambda_i$ updating stage has fixed complexity and is negligible compared with the user search. Let $n = |S|$ denote the cardinality of $S$. The complexity of each step is calculated as follows.

• For a given $S$ in step 2), the GUSS algorithm evaluates $K - n$ rates $R(S \cup \{w\})$. The evaluation of $R(S \cup \{w\})$ is split into the evaluation of $\lambda_i^{+}(w)$ followed by the evaluation of $\mu$ according to (10). The evaluation of all $\lambda_i^{+}(w)$ for $i \in S \cup \{w\}$ requires $n$ vector-vector multiplications and $n + 1$ vector 2-norms (vectors are $1 \times M$), and thus takes $M(2n + 1)$ multiplications. Repeating this over the $K - n$ remaining users, we obtain the user search complexity in step 2) as $M(K - n)(2n + 1)$ multiplications.

• For a given $S$ in step 3), the GUSS algorithm evaluates $n$ rates $R(S \setminus \{w\})$. Similar to step 2), the evaluation of $\lambda_i^{-}(w)$ for $i \in S \setminus \{w\}$ involves $M(2n - 1)$ multiplications. Repeating this over the $n$ selected users, we obtain the user search complexity in step 3) as $Mn(2n - 1)$ multiplications.

• For a given $S$ in step 4), the GUSS algorithm evaluates $Kn - n^2$ rates $R(S \cup \{l\} \setminus \{k\})$. Suppose the $\lambda_i^{\pm}(k,l)$ are calculated according to (50)-(51), i.e., 'add' precedes 'delete' in a 'swap'. The user search then involves $2Mn^2 + 3Mn + M$ multiplications for each group of $\lambda_i^{\pm}(k,l)$ with $k \in S$, and $(2Mn^2 + 3Mn + M)(K - n)$ complex multiplications in total. The user search involves $2MKn^2 - 2Mn^3 + 3Mn^2 - 3Mn$ complex multiplications if the $\lambda_i^{\pm}(k,l)$ are calculated according to (56)-(58), i.e., 'delete' precedes 'add' in a 'swap'. However, both orders have the same level of complexity, $O\left(2Mn^2(K - n)\right)$.
The total complexity of step 2) of GUSS is approximately $\sum_{n=1}^{M} M(K - n)(2n + 1)$, which is $O\left(KM^3 - \frac{2}{3}M^4\right)$. Suppose the numbers of iterations in steps 3) and 4) are $b$ and $a$, respectively, which will be shown to be small in the next section. The total complexities of steps 3) and 4) are $O(2bM^3)$ and $O(2aKM^3 - 2aM^4)$. So, the complexity of GUSS is $O\left((2a + 1)KM^3 - (2a + \frac{2}{3})M^4\right)$, and the complexity of GUS-nS is $O\left(KM^3 - \frac{2}{3}M^4\right)$. When the number of users $K \gg M$, the complexities of GUSS and GUS-nS simplify to $O\left((2a + 1)KM^3\right)$ and $O(KM^3)$, respectively. Since the complexity of both ZFS and SWF is $O(KM^3)$, GUS-nS has the same order of complexity as ZFS and SWF, and GUSS incurs a linear complexity increase by a factor of $2a + 1$. However, as will be shown in the next section, both GUSS and GUS-nS outperform ZFS and SWF in terms of achieved sum rate.
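The multiplication counts quoted in this subsection can be evaluated directly. The sketch below assumes the step 2) total is summed over $n = 1, \dots, M$ and simply codes the stated formulas; the function names are hypothetical.

```python
def add_search_mults(K, M):
    """Exact multiplication count of the step 2) search, summed over n = 1..M."""
    return sum(M * (K - n) * (2 * n + 1) for n in range(1, M + 1))

def add_search_leading(K, M):
    """Leading-order approximation K M^3 - (2/3) M^4 of the same count."""
    return K * M ** 3 - 2 * M ** 4 / 3

def swap_search_mults(K, M, n, add_first=True):
    """Multiplications of one step 4) search for |S| = n, for either swap order."""
    if add_first:   # 'add' precedes 'delete'
        return (2 * M * n ** 2 + 3 * M * n + M) * (K - n)
    # 'delete' precedes 'add'
    return 2 * M * K * n ** 2 - 2 * M * n ** 3 + 3 * M * n ** 2 - 3 * M * n

exact_small = add_search_mults(20, 10)
ratio_large = add_search_mults(400, 200) / add_search_leading(400, 200)
swap_add_first = swap_search_mults(20, 10, 5, add_first=True)
swap_delete_first = swap_search_mults(20, 10, 5, add_first=False)
```

For $K = 20$, $M = 10$ and $n = 5$ the two swap orders cost 9900 and 8100 multiplications, against a leading term $2Mn^2(K - n) = 7500$, so the 'delete first' order is somewhat cheaper at the same order of complexity.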

VI. Simulation Results

In this section, we present a numerical performance comparison among GUSS, GUS-nS, ZFS, SWF, SUS and exhaustive search. The achieved sum rate $R(S)$ and the number of selected users $|S|$ of these algorithms under different $K$ and $P$, averaged over the channel distribution, are compared in the following.

A. Number of users 

The simulated multi-user system has $M = 10$ transmit antennas at the BS and transmit SNR $P = 15$ dB, and the number of users $K$ ranges from 8 to 20. All curves are obtained by averaging over $10^4$ independent complex-valued channel realizations, drawn from an i.i.d. Rayleigh distribution with unit variance for each channel entry.
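A channel realization of this kind can be drawn as in the following sketch. It assumes that "unit variance" means each entry is circularly symmetric complex Gaussian with variance 1/2 per real dimension, and that $P$ is used on a linear scale with unit noise power. The function names are hypothetical.

```python
import math
import random

def db_to_linear(db):
    """Convert a transmit SNR in dB to a linear power ratio."""
    return 10.0 ** (db / 10.0)

def rayleigh_channel(K, M, rng):
    """K x M channel matrix with i.i.d. CN(0, 1) entries (Rayleigh-distributed magnitudes)."""
    s = math.sqrt(0.5)   # standard deviation per real dimension
    return [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(M)]
            for _ in range(K)]

# Example: the empirical entry power should be close to 1.
rng = random.Random(1)
H = rayleigh_channel(2000, 10, rng)
mean_power = sum(abs(x) ** 2 for row in H for x in row) / (2000 * 10)
P_linear = db_to_linear(15.0)   # about 31.6
```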

Fig. 5 shows that the throughput of all algorithms grows with the number of users $K$. There are two reasons: first, a larger $K$ provides higher multiuser diversity gain, as it is more likely that a user set with strong channel norms $\|h\|$ and effective-channel-gains $\lambda$ can be selected; second, a larger $K$ provides higher multiplexing gain because the cardinality of the selected user set increases with $K$, as shown in Fig. 6.

The exhaustive search achieves the highest throughput of all user selection algorithms, followed in order by GUSS, GUS-nS, ZFS and SUS. SUS is simulated with a carefully chosen threshold $\alpha = 0.44$, which is the optimum choice for $K = 13$, while the optimum $\alpha$ ranges between 0.41 and 0.52 when $K$ changes from 20 to 8. ZFS achieves a considerably higher sum rate than SUS as it guarantees a sum rate increment in each step of user selection.

To reveal more details on the performance of the GUSS and GUS-nS algorithms, Fig. 7 presents their ratios of eliminating redundant users and escaping from local optima, i.e., the fraction of user selection instances in which an effective 'delete' or 'swap' operation increases the sum rate. GUS-nS achieves higher throughput than ZFS, a 0.04 bps/Hz increment for $K = 14$, by eliminating redundant users in $S_{ZFS}$, so that the cardinality of the selected user set satisfies $|S_{GUS\text{-}nS}| < |S_{ZFS}|$, as shown in Fig. 6. On average, 5.0% of $S_{ZFS}$ contain redundant users according to Fig. 7.

GUSS achieves a further throughput increment over ZFS, 0.43 bps/Hz for $K = 14$, by eliminating redundant users and escaping from local optima in $S_{ZFS}$ simultaneously. It selects a user set with larger cardinality, $|S_{GUSS}| > |S_{ZFS}|$, as shown in Fig. 6. This indicates that more effective 'add' operations with $\Delta R > 0$ are conducted after 'swap' operations, because only 'add' enlarges the user set and 'swap' does not. According to Fig. 7, 40.1% of $S_{ZFS}$ are trapped in a local optimum on average, and the ratio increases with $K$. The ratio of eliminating redundant users is 7.1% in GUSS, which is higher than that in GUS-nS because the add operations after a swap in GUSS introduce more redundant users. GUSS achieves a higher sum rate and user set cardinality than ZFS, but both remain lower than those of exhaustive search as only one-for-one swaps are used in GUSS.

B. Transmit SNR 

The achieved throughput and the cardinality of the selected user set both increase with the transmit SNR $P$, with the same trend as with $K$ in Fig. 5 and Fig. 6, for all algorithms except SUS. The SUS algorithm selects the same user set under different $P$ because its user selection procedure does not take $P$ into consideration. However, SUS still achieves a higher sum rate at larger $P$, as the sum rate increases with $P$ for the same user set.

Fig. 8 shows the throughputs of the GUSS, GUS-nS, SWF and ZFS algorithms as fractions of the throughput of the exhaustive search algorithm at different transmit SNRs $P$. Fig. 9 shows the ratio of channel instances in which a redundant user or a local optimum is encountered in the user selection process of the GUSS, GUS-nS and SWF algorithms. The simulated multi-user system has $M = 10$ and $K = 15$, and $P$ ranges from 0 dB to 30 dB. All curves are obtained by averaging over $10^6$ independent channels.

Ranked from high to low, the throughput ratios are those of GUSS, GUS-nS, and then SWF and ZFS. The fraction of the GUSS throughput to the throughput of exhaustive search approaches 1 when $P$ approaches zero or infinity, and it exhibits a valley in between. The same trend exists for GUS-nS, SWF and ZFS, but these algorithms require a higher $P$ to recover from the valley.

SWF has exactly the same sum rate performance as ZFS, and the ratio of 'eliminating redundant user' for SWF equals zero over the whole range considered in Fig. 9. No redundant user with $p_i = 0$ occurred in one million simulations, which supports the conclusion in Section III. GUS-nS achieves 98.2% of the sum rate upper bound on average, which corresponds to a 0.1% throughput increment over ZFS, by eliminating the redundant users found in 4.2% of $S_{ZFS}$ on average, as shown in Fig. 9. The ratio of redundant users increases with $P$ from 0 dB to 15 dB and then decreases, because a user that is redundant when $P$ is low is no longer redundant when $P$ becomes large enough. In the example in Fig. 2, user 2 is a redundant user when $P = 30$ dB but not when $P$ increases to 40 dB.

GUSS achieves a 1.7% higher sum rate than ZFS at $P = 30$ dB, since at least 63.8% of $S_{ZFS}$ are trapped in a local optimum and 5.2% of $S_{ZFS}$ contain redundant users, all of which are handled by GUSS, as shown in Fig. 9. The gap between GUSS and ZFS increases with $P$ in the range shown in Fig. 8 because the probability that $S_{ZFS}$ is trapped in a local optimum increases with $P$. At the same time, GUSS eliminates 2.7% more redundant users than GUS-nS on average because more effective 'add' operations with $\Delta R > 0$ are conducted after the 'swap' operation in GUSS, which turns more users into redundant users. On average, 6.9% of channel instances involve redundant users and 43.1% of channel instances are trapped in a local optimum in the process of GUSS. According to Fig. 8, GUSS achieves 99.3% of the sum rate upper bound averaged over the SNR range considered.

C. Complexity of GUSS

GUSS provides a considerable throughput increment over ZFS by adding the 'delete' and 'swap' operations, which raise the complexity linearly by a factor of $2a + 1$. The number of swap operations $a$ is influenced by $K$, $M$, $P$ and $H$. Fig. 10 shows the averaged $a$ for different numbers of users $K$ ranging from 10 to 40 at $P = 15$ dB and $M = 5, 10$. Fig. 11 shows the averaged $a$ for $10 \le K \le 40$ and $0\ \text{dB} \le P \le 30\ \text{dB}$ at $M = 10$. All curves are obtained by averaging over $10^4$ independent channels.

For all $M$ and $K$ considered in Fig. 10, $a$ stays between 1.4 and 1.85, which implies that GUSS has four to five times the complexity of ZFS. GUSS performs more swap operations at $M = 10$ than at $M = 5$ for each specific $K$ when $K > M$: a system with larger $M$ selects more users, which makes it more likely that $S_{ZFS}$ is trapped in a local optimum. $a$ decreases with $K$ when $K > 30$ for $M = 10$, and when $K > 25$ for $M = 5$. This is because the selected users are almost orthogonal with high probability when $K$ is large enough, and a smaller $K$ suffices to reach a near-orthogonal user set when the BS has fewer antennas $M$.

$a$ stays between 1 and 2.5 for the $K$ and $P$ range considered in Fig. 11, which implies that GUSS has only three to six times the complexity of ZFS. For a given $P$, $a$ increases with $K$ before it saturates, and a smaller $K$ is needed to reach the maximum $a$ at larger $P$. $a$ also increases with $P$ before it saturates and then decreases with $P$, because the number of selected users increases with $P$ and saturates when $P$ is large enough. $a$ equals 2.04 at $P = 30$ dB, $K = 15$ and $M = 10$, which corresponds to about five times the complexity of ZFS for GUSS.
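As a quick check of the multiples quoted in this subsection, substituting the reported values of $a$ into the factor $2a + 1$ gives:

```latex
\begin{align*}
a = 1.4  &: \quad 2a + 1 = 3.8,  & a = 1.85 &: \quad 2a + 1 = 4.7, \\
a = 1    &: \quad 2a + 1 = 3,    & a = 2.5  &: \quad 2a + 1 = 6, \\
a = 2.04 &: \quad 2a + 1 = 5.08.
\end{align*}
```

These values match the "four to five times", "three to six times" and "about five times" figures stated in the text.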

VII. Conclusion 

We have identified two flaws in traditional greedy user selection for the multi-user MIMO downlink with ZFBF: 'redundant user' and 'local optimum'. While traditional greedy user selection methods use only the 'add' operation when updating the selected user set, the proposed GUSS algorithm also allows 'delete' and 'swap' operations, which eliminate redundant users and help escape from local optima. An ECV based effective-channel-gain $\lambda$ updating method for the 'add', 'delete' and 'swap' operations is designed to reduce the complexity of GUSS. GUSS provides a considerable throughput increment with only a linear complexity increase by a factor of $2a + 1$, where $a$ is the number of swap operations for a specific channel realization, whose average stays between 1 and 2.5 according to our simulation results. Simulation results verify the improved throughput performance and low complexity.

The GUSS algorithm proposed in this paper achieves 99.3% of the upper bound throughput performance, which is significant for multi-user MIMO downlink transmission. The novel ECV based effective-channel-gain $\lambda$ updating method is also a useful component for building more sophisticated user selection algorithms, such as the decremental user selection algorithm proposed for massive multi-antenna systems in [20]. The work in this paper can be extended in several ways, including considering per-antenna transmit power constraints, multi-antenna users, partial CSIT, and fairness among users.

Appendix 

Proof of Lemma 1: Suppose the selected user set with a redundant user is $S = \{1, 2, \cdots, n\}$, and the ECV and effective-channel-gain of user $i \in S$ are $\nu_i$ and $\lambda_i$, respectively. Let $\lambda_1 > \lambda_2 > \cdots > \lambda_n$ and suppose only user $n$ is allocated zero transmit power, i.e.,

$$\frac{1}{\lambda_{n-1}} < \mu \le \frac{1}{\lambda_n},$$

where $\mu = \frac{1}{n-1}\left(P + \sum_{i=1}^{n-1} \frac{1}{\lambda_i}\right)$ is the water level for $S$. Suppose deleting user $k$ achieves the maximum sum rate among the sets $S \setminus \{j\}$, i.e., $k = \arg\max_{j \in S} R(S \setminus \{j\})$. The conclusion of Lemma 1 is equivalent to

$$R(S \setminus \{n\}) \ge R(S) \tag{61}$$

and

$$R(S \setminus \{k\}) > R(S \setminus \{n\}). \tag{62}$$

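Denote the updated effective-channel-gain of user $i$ after deleting user $j \in S$ as $\lambda_{i,j-}$ and the corresponding water level as $\mu_{j-}$. According to (48), we have

$$\lambda_{i,j-} = \frac{\lambda_i^2 \lambda_j}{\lambda_i \lambda_j - \|\nu_i \nu_j^*\|^2}, \quad i \in S \setminus \{j\},$$

and $\lambda_{i,j-} \ge \lambda_i$ for all $i \in S \setminus \{j\}$ since $\lambda_i \lambda_j > \|\nu_i \nu_j^*\|^2$. According to (10), (61) holds because

$$R(S \setminus \{n\}) \ge \sum_{i=1}^{n-1} \log\left(1 + \left(\mu - \frac{1}{\lambda_i}\right)\lambda_{i,n-}\right) \ge \sum_{i=1}^{n-1} \log\left(1 + \left(\mu - \frac{1}{\lambda_i}\right)\lambda_i\right) = R(S), \tag{63}$$

where the first inequality holds as $S \setminus \{n\}$ with waterfilling achieves an equal or larger sum rate than distributing the power in the same way as in $S$, and the second inequality holds since $\lambda_{i,n-} \ge \lambda_i$.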

Suppose the transmit power scaling factor of user $i$ in $S \setminus \{j\}$ is $p_{i,j-}$ after waterfilling. Inequality (62) holds on the condition

TT J\^> TT J^L. (64) 
, 11 sin 2 c? ifc ^ 11 sm 2 c? jn 

where $\theta_{i,j}$ is the angle between $\nu_i$ and $\nu_j$, which is independent of $\lambda_i$ and $\lambda_j$, with $\cos^2\theta_{i,j} = \frac{\|\nu_i \nu_j^*\|^2}{\lambda_i \lambda_j}$. Condition (64) is achievable when user $k$ has stronger channel correlation with the other users than user $n$ does, i.e., $\sin^2\theta_{i,k} < \sin^2\theta_{i,n}$, so that deleting user $k$ provides a larger effective-channel-gain increment for users $i \in S \setminus \{k, n\}$, i.e., $\lambda_{i,k-} > \lambda_{i,n-}$. The throughput increment of users $i \in S \setminus \{k, n\}$ could then compensate for the throughput loss caused by deleting user $k$.
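The link between the deletion update and the angle $\theta_{i,j}$ can be made explicit with a short worked step. It assumes the deletion update has the form $\lambda_{i,j-} = \lambda_i^2\lambda_j / (\lambda_i\lambda_j - \|\nu_i\nu_j^*\|^2)$ and uses $\|\nu_i\|^2 = \lambda_i$:

```latex
\begin{align*}
\lambda_{i,j-}
  &= \frac{\lambda_i^2 \lambda_j}{\lambda_i \lambda_j - \|\nu_i \nu_j^*\|^2}
   = \frac{\lambda_i}{1 - \dfrac{\|\nu_i \nu_j^*\|^2}{\lambda_i \lambda_j}} \\
  &= \frac{\lambda_i}{1 - \cos^2\theta_{i,j}}
   = \frac{\lambda_i}{\sin^2\theta_{i,j}} .
\end{align*}
```

Under this form, a smaller $\sin^2\theta_{i,k}$, that is, a stronger correlation between users $i$ and $k$, directly means a larger gain $\lambda_{i,k-}$ for user $i$ once user $k$ is deleted.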
TABLE I: Comparison between ZFS user selection and exhaustive search

Procedure of ZFS

$S_{ZFS}$

$0 < P < 10.48$

$S_1 = \{2\}$

$\{1, 3\}$

$10.48 < P < 27.13$

$S_1 = \{2\}, S_2 = \{1, 2\}$

$\{1, 2\}$

$27.13 < P < 34.85$

$S_1 = \{2\}, S_2 = \{1, 2\}, S_3 = \{1, 2, 3\}$

$\{1, 2, 3\}$

$\{1, 3\}$

$P > 34.85$

$S_1 = \{2\}, S_2 = \{1, 2\}, S_3 = \{1, 2, 3\}$
Fig. 2: Sum rate versus transmit SNR for different selected user sets

Fig. 6: Cardinality of selected user set comparison of GUSS, GUS-nS, ZFS, SUS and exhaustive search algorithms with $M = 10$ and $P = 15$ dB

Fig. 7: Ratio of GUSS and GUS-nS algorithms 'eliminating redundant user' and 'escaping from local optimum' with $M = 10$ and $P = 15$ dB

Fig. 8: Throughput fractions of GUSS, GUS-nS, ZFS and SWF algorithms over the throughput of exhaustive search with $M = 10$ and $K = 15$

Fig. 9: Ratio of GUSS, GUS-nS and SWF algorithms 'eliminating redundant user' and 'escaping from local optimum' with $M = 10$ and $K = 15$

Fig. 10: Number of swaps in GUSS for different numbers of users $K$ at $P = 15$ dB and $M = 5, 10$

Fig. 11: Number of swaps $a$ in GUSS for $10 \le K \le 50$ and $0\ \text{dB} \le P \le 30\ \text{dB}$ at $M = 10$